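In this note, we introduce a general version of the well-known elliptical potential lemma, a technique widely used in the analysis of algorithms for sequential learning and decision-making problems. We consider a stochastic linear bandit setting in which a decision-maker sequentially chooses among a set of given actions, observes their noisy rewards, and aims to maximize her cumulative expected reward over a decision-making horizon. The elliptical potential lemma is a key tool for quantifying uncertainty about the parameters of the reward function, but it requires the noise and the prior distributions to be Gaussian. Our general elliptical potential lemma relaxes this Gaussian requirement, which is a highly non-trivial extension for several reasons: unlike the Gaussian case, there is no closed-form solution for the covariance matrix of the posterior distribution, the covariance matrix is not a deterministic function of the actions, and the covariance matrix is not decreasing with respect to the semidefinite inequality. While this result is of broad interest, we showcase an application of it by proving an improved Bayesian regret bound for the well-known Thompson sampling algorithm in stochastic linear bandits with changing action sets, where the prior and noise distributions are general. This bound is optimal up to constants.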
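For orientation, the classical deterministic form of the elliptical potential lemma, as it is commonly stated in the linear bandit literature, can be written as follows. The notation ($V_t$, $a_t$, $L$, $d$, $n$) is assumed here for illustration and is not taken from the note itself, whose general version is not reproduced.

```latex
% Classical elliptical potential lemma (standard statement; notation is assumed, not the note's).
Let $V_0 \in \mathbb{R}^{d \times d}$ be positive definite, let $a_1, \dots, a_n \in \mathbb{R}^d$
satisfy $\|a_t\|_2 \le L$ for all $t$, and set $V_t = V_0 + \sum_{s \le t} a_s a_s^\top$. Then
\[
  \sum_{t=1}^{n} \min\bigl\{ 1, \|a_t\|_{V_{t-1}^{-1}}^{2} \bigr\}
  \;\le\; 2 \log \frac{\det V_n}{\det V_0}
  \;\le\; 2 d \log \frac{\operatorname{tr}(V_0) + n L^{2}}{d \, \det(V_0)^{1/d}} .
\]
% With a Gaussian prior N(0, \Sigma_0) and Gaussian noise of variance \sigma^2, the posterior
% covariance satisfies \Sigma_t^{-1} = \Sigma_0^{-1} + \sigma^{-2} \sum_{s \le t} a_s a_s^\top,
% so it is a deterministic, monotonically shrinking function of the actions.
```

The closing comment shows why the Gaussian case is convenient: the posterior covariance has exactly the form of $V_t^{-1}$, which is the structure the abstract says is lost for general priors and noise.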
translated by 谷歌翻译
In this paper, we study the trace regression when a matrix of parameters B* is estimated via the convex relaxation of a rank-regularized regression or via regularized non-convex optimization. It is known that these estimators satisfy near-optimal error bounds under assumptions on the rank, coherence, and spikiness of B*. We start by introducing a general notion of spikiness for B* that provides a generic recipe to prove the restricted strong convexity of the sampling operator of the trace regression and obtain near-optimal and non-asymptotic error bounds for the estimation error. Similar to the existing literature, these results require the regularization parameter to be above a certain theory-inspired threshold that depends on observation noise that may be unknown in practice. Next, we extend the error bounds to cases where the regularization parameter is chosen via cross-validation. This result is significant in that existing theoretical results on cross-validated estimators (Kale et al., 2011; Kumar et al., 2013; Abou-Moustafa and Szepesvari, 2017) do not apply to our setting since the estimators we study are not known to satisfy their required notion of stability. Finally, using simulations on synthetic and real data, we show that the cross-validated estimator selects a near-optimal penalty parameter and outperforms the theory-inspired approach of selecting the parameter.
Graph neural networks (GNNs) have recently been applied to a variety of natural language processing (NLP) tasks. Their ability to encode corpus-wide features in a graph representation has made GNN models popular for tasks such as document classification. One major shortcoming of such models is that they mainly work on homogeneous graphs, whereas representing text datasets as graphs requires several node types, which leads to a heterogeneous schema. In this paper, we propose a transductive hybrid approach composed of an unsupervised node representation learning model followed by a node classification/edge prediction model. The proposed model can process heterogeneous graphs to produce unified node embeddings, which are then used for node classification or link prediction as the downstream task. The model is developed to classify stock market technical analysis reports, which to our knowledge is the first work in this domain. Experiments carried out on a constructed dataset demonstrate the model's ability in embedding extraction and in the downstream tasks.
In this paper, we propose a robust election simulation model and an independently developed election anomaly detection algorithm that demonstrates the simulation's utility. The simulation generates artificial elections with properties and trends similar to those of real-world elections, while giving users control over, and knowledge of, all the important components of the elections. We generate a clean election results dataset without fraud as well as datasets with varying degrees of fraud. We then measure how well the algorithm detects the level of fraud present. The algorithm determines how similar the actual election results are to the results predicted from polling and from a regression model of other regions with similar demographics. We use k-means to partition electoral regions into clusters such that demographic homogeneity is maximized within each cluster. We then use a novelty detection algorithm, implemented as a one-class Support Vector Machine, where the clean data is provided in the form of polling predictions and regression predictions. The regression predictions are built from the actual data in such a way that the data supervises itself. We show the effectiveness of both the simulation technique and the machine learning model in identifying fraudulent regions.
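The clustering step described above can be sketched in a few lines. This is a minimal illustration of plain k-means (Lloyd's algorithm) in standard-library Python, not the paper's implementation: the region features (`regions`, read here as two demographic scores scaled to [0, 1]), the deterministic initialization, and the function name `kmeans` are all assumptions made for the example, and the one-class SVM stage is not shown.

```python
def kmeans(points, k, iters=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Assumption: deterministic initialization from the first k points.
    centers = [points[i] for i in range(k)]

    def nearest(p):
        # Index of the center with the smallest squared Euclidean distance to p.
        return min(range(k),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))

    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # Recompute each center as the mean of its cluster (keep it if the cluster is empty).
        new_centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:  # assignments have stabilized
            break
        centers = new_centers
    labels = [nearest(p) for p in points]
    return centers, labels


# Hypothetical regions: (scaled median age, scaled urban share). Illustrative values only.
regions = [(0.20, 0.90), (0.25, 0.85), (0.80, 0.10),
           (0.75, 0.20), (0.22, 0.88), (0.78, 0.15)]
centers, labels = kmeans(regions, k=2)
print(labels)
```

In a pipeline like the one the abstract describes, each cluster would then supply the "similar demographics" group from which a region's regression prediction is built.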
This letter explains an algorithm for finding a set of base functions. The method aims to capture the leading behavior of a dataset in terms of a few base functions. An A-star search finds these functions, while gradient descent optimizes the parameters of the functions at each search step. We show the resulting plots to compare the extrapolation with unseen data.
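A rough sketch of how a search over base functions can be combined with gradient descent is given below. It is not the letter's algorithm: the candidate library `CANDIDATES`, the cost `f = penalty * size + MSE`, and the names `fit` and `search` are assumptions made for illustration. Because the fitted error is used as the heuristic and is not guaranteed admissible, this is an A-star-style best-first search rather than a textbook A-star.

```python
import heapq
import math

# Hypothetical candidate library; the letter does not list its base functions.
CANDIDATES = {
    "1": lambda x: 1.0,
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
}


def fit(names, xs, ys, steps=2000, lr=0.05):
    """Gradient descent on the weights of sum_j w_j * f_j(x); returns (weights, mse)."""
    rows = [[CANDIDATES[n](x) for n in names] for x in xs]
    w = [0.0] * len(names)
    for _ in range(steps):
        grad = [0.0] * len(names)
        for row, y in zip(rows, ys):
            err = sum(wj * fj for wj, fj in zip(w, row)) - y
            for j, fj in enumerate(row):
                grad[j] += 2.0 * err * fj / len(xs)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    mse = sum((sum(wj * fj for wj, fj in zip(w, row)) - y) ** 2
              for row, y in zip(rows, ys)) / len(xs)
    return w, mse


def search(xs, ys, penalty=0.01, tol=1e-6):
    """Best-first search over subsets, ordered by f = penalty * size + fitted MSE."""
    w0, mse0 = fit((), xs, ys)
    frontier = [(mse0, (), w0, mse0)]
    seen = {()}
    while frontier:
        _, names, w, mse = heapq.heappop(frontier)
        # Stop at a good enough fit, or when no candidate is left to add.
        if mse < tol or len(names) == len(CANDIDATES):
            return names, w
        for extra in CANDIDATES:
            child = tuple(sorted(names + (extra,)))
            if extra in names or child in seen:
                continue
            seen.add(child)
            cw, cmse = fit(child, xs, ys)  # optimize parameters at this search step
            heapq.heappush(frontier, (penalty * len(child) + cmse, child, cw, cmse))
    return (), []


# Synthetic data from y = 0.5 + 2x, used only to exercise the sketch.
xs = [i / 10 for i in range(-10, 11)]
ys = [0.5 + 2.0 * x for x in xs]
names, weights = search(xs, ys)
print(names, weights)
```

The size penalty is what pushes the search toward "a few" base functions; evaluating the returned combination outside the range of `xs` would give the kind of extrapolation check the letter describes.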
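Collocated tactile sensing is a fundamental enabling technology for dexterous manipulation. However, deformable sensors introduce complex dynamics between the robot, the grasped object, and the environment that must be considered for fine manipulation. Here, we propose a method to learn soft tactile sensor membrane dynamics that accounts for sensor deformations caused by the physical interaction between the grasped object and the environment. Our method combines the perceived 3D geometry of the membrane with proprioceptive reaction wrenches to predict future deformations conditioned on robot actions. Grasped object poses are recovered from the membrane geometry and reaction wrenches, decoupling the interaction dynamics from the tactile observation model. We benchmark our approach on two real-world contact-rich tasks: drawing with a grasped marker and in-hand pivoting. Our results suggest that explicitly modeling membrane dynamics achieves better task performance and better generalization to unseen objects than the baselines.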
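In this paper, we propose an algorithm to interpolate between a pair of images of a dynamic scene. Although significant progress has been made in frame interpolation over the past few years, current approaches cannot handle images with brightness and illumination changes, which are common even when the images are captured shortly apart. We propose to address this problem by taking advantage of existing optical flow methods, which are highly robust to variations in illumination. Specifically, using the bidirectional flows estimated by an existing pre-trained flow network, we predict the flows from an intermediate frame to the two input images. To do this, we propose to encode the bidirectional flows into a coordinate-based network, powered by a hypernetwork, to obtain a continuous representation of the flow across time. Once we obtain the estimated flows, we use them within an existing blending network to obtain the final intermediate frame. Through extensive experiments, we demonstrate that our approach produces significantly better results than state-of-the-art frame interpolation algorithms.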
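Detect-and-avoid (DAA) capabilities are critical for the safe operation of unmanned aircraft systems (UAS). This paper introduces Airtrack, a real-time, vision-only detection and tracking framework that respects the size, weight, and power (SWaP) constraints of sUAS systems. Given the low signal-to-noise ratio (SNR) of far-away aircraft, we propose using full-resolution images in a deep learning framework that aligns successive images to remove ego-motion. The aligned images are then used downstream in cascaded primary and secondary classifiers to improve detection and tracking performance on multiple metrics. We show that Airtrack outperforms state-of-the-art baselines on the Amazon Airborne Object Tracking (AOT) dataset. Multiple real-world flight tests with a Cessna 172 interacting with general aviation traffic, and additional near-collision flight tests flying toward a UAS in a controlled setting, show that the proposed approach satisfies the newly introduced ASTM F3442/F3442M standard for DAA. Empirical evaluations show that our system achieves a probability of track of more than 95% out to a range of 900 m. Video is available at https://youtu.be/h3ll_wjxjpw.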
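Deep reinforcement learning (DRL) is employed to develop autonomously optimized and custom-designed heat-treatment processes that are both microstructure-sensitive and energy-efficient. Unlike conventional supervised machine learning, DRL does not rely on static neural network training from data alone; instead, a learning agent autonomously develops optimal solutions based on reward and penalty elements, with reduced or no supervision. In our approach, a temperature-dependent Allen-Cahn model for phase transformation is used as the environment for the DRL agent, serving as the model world in which it gains experience and takes autonomous decisions. The agent of the DRL algorithm controls the temperature of the system, which acts as a model furnace for the heat treatment of alloys. Microstructure goals are defined for the agent based on the desired microstructure of the phases. After training, the agent can generate temperature-time profiles for a variety of initial microstructure states to reach the final desired microstructure state. The agent's performance and the physical meaning of the generated heat-treatment profiles are investigated in detail. In particular, the agent is capable of controlling the temperature to reach the desired microstructure starting from a variety of initial conditions. This ability to handle a variety of conditions paves the way for using such an approach for recycling-oriented heat-treatment process design, where the initial composition can vary from batch to batch due to impurity intrusion, and also for the design of energy-efficient heat treatments. To test this hypothesis, an agent without a penalty on the total consumed energy is compared with one that considers energy costs. The energy cost penalty is imposed as an additional criterion on the agent for finding the optimal temperature-time profile.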
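Neural networks are powerful surrogates for numerous forward processes. The inversion of such surrogates is extremely valuable in science and engineering. The most important property of a successful neural inverse method is the performance of its solutions when deployed in the real world, i.e., on the native forward process and not only on the learned surrogate. We propose Autoinverse, a highly automated approach for inverting neural network surrogates. Our main insight is to seek inverse solutions in the vicinity of reliable data, namely data that have been sampled from the forward process and used to train the surrogate model. Autoinverse finds such solutions by taking into account the predictive uncertainty of the surrogate and minimizing it during the inversion. Apart from high accuracy, Autoinverse enforces the feasibility of solutions, comes with embedded regularization, and is initialization-free. We verify our proposed method by addressing a set of real-world problems in control, fabrication, and design.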